pacman:: p_load(sf, spatstat, tmap, rvest, tidyverse)02-2: 2nd Order Spatial Point Patterns Analysis Methods
execute: echo: true #display the code eval: true message: false warning: false freeze: false # true: not render if nothing edited editor: visual —
1 Overview
This exercise uses second-order spatial point pattern analysis and spatstat package to study childcare centres in Singapore. Unlike first-order analysis, which looks only at overall density independently, second-order methods reveal relationships and influences between centres (points) based on distance.
We aim to answer two key questions:
Are childcare centres randomly distributed across Singapore?
If not, where are the areas with higher concentrations?
2 The data
As also used in 1st order SPPA, the following datasets were downloaded from publicly available websites, and both are available in KML and GeoJSON format.
| Dataset Name | Source | Discrption |
|---|---|---|
| Child Care Services | data.gov.sg | Point feature data: contains the locations and attributes of childcare centres. |
| Master plan 2019 Subzone Boundary (No Sea) | singstat | Polygon feature data: represents the URA 2019 Master Plan planning subzone boundaries. |
3 Installing and Loading the R packages
In addition to spatstat, a total of five R packages will be used in this exercise.
| Package | Discription |
|---|---|
| sf | Simple Features, a new R package which handles importing, managing, and processing vector-based geospatial data. |
| spatstat | Provides useful functions for SPPA, which will be called to conduct both 1st and 2nd SPPA and KDE. |
| tmap | Creates high quality static or interactive choropleth maps via leaflet. |
| rvest | Scrapes and extracts data from web pages. |
After installation, we load them into R environment using the code below.
4 Data Import and Preparation
4.1 importing data
The following code chunk shows the steps to first import the Master plan 2019 Subzone Boundary (No Sea) data using st_read, extract the required 4 columns from the Description field, filter out the nearby islands, and finally save the file as mpsz_cl for further analysis.
mpsz_sf <- st_read("data/MasterPlan2019SubzoneBoundaryNoSeaKML.kml") %>%
st_zm(drop = TRUE, what = "ZM") %>%
st_transform(crs = 3414)Reading layer `URA_MP19_SUBZONE_NO_SEA_PL' from data source
`C:\lsrgc\ISSS626-yiqiong-pan\Hands-on_Ex\Hands-on_Ex02\data\MasterPlan2019SubzoneBoundaryNoSeaKML.kml'
using driver `KML'
Simple feature collection with 332 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
extract_kml_field <- function(html_text, field_name) {
if (is.na(html_text) || html_text == "") return(NA_character_)
page <- read_html(html_text)
rows <- page %>% html_elements("tr")
value <- rows %>%
keep(~ html_text2(html_element(.x, "th")) == field_name) %>%
html_element("td") %>%
html_text2()
if (length(value) == 0) NA_character_ else value
}# map_chr of purr (tidyverse) applies a function to each element of a list/vector and returns a character vector.
mpsz_sf <- mpsz_sf %>%
mutate(
REGION_N = map_chr(Description, extract_kml_field, "REGION_N"),
PLN_AREA_N = map_chr(Description, extract_kml_field, "PLN_AREA_N"),
SUBZONE_N = map_chr(Description, extract_kml_field, "SUBZONE_N"),
SUBZONE_C = map_chr(Description, extract_kml_field, "SUBZONE_C")
) %>%
select(-Name, -Description) %>%
relocate(geometry, .after = last_col())mpsz_cl <- mpsz_sf %>%
filter(SUBZONE_N != "SOUTHERN GROUP",
PLN_AREA_N != "WESTERN ISLANDS",
PLN_AREA_N != "NORTH-EASTERN ISLANDS")write_rds(mpsz_cl,
"data/mpsz_cl.rds")The code chuck below imports downloaded ChildCareServices data to R as sf data frame as childcare_sf by using st_read, coverts 3d to 2d (st_zm) and finally transform the CRS from WGS84 to SVY21.
childcare_sf <- st_read("data/ChildCareServices.geojson") %>%
st_zm(drop = TRUE, what = "ZM") %>% # Drop Z and M to convert from multi-dimensional to 2d (XY)
st_transform(crs = 3414)Reading layer `ChildCareServices' from data source
`C:\lsrgc\ISSS626-yiqiong-pan\Hands-on_Ex\Hands-on_Ex02\data\ChildCareServices.geojson'
using driver `GeoJSON'
Simple feature collection with 1925 features and 2 fields
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 103.6878 ymin: 1.247759 xmax: 103.9897 ymax: 1.462134
z_range: zmin: 0 zmax: 0
Geodetic CRS: WGS 84
Using the tmap mapping methods, the code chunk below creates a map combining childcare_sf and mpsz_cl.
tm_shape(mpsz_cl)+
tm_polygons() +
tm_shape(childcare_sf) +
tm_dots(size = 0.3)
Alternatively, an interactive thematic map can be plotted using the code below. The interactive map is easy to navigate and query intuitively. It is optional to change the background map layer(choices: ESRI.WorldGrayCanvas(default), OpenStreetMap, ESRI.WorldTopoMap).
tmap_mode('view')
tm_shape(childcare_sf) +
tm_dots() #creates a layer of dots to visualise point data on a map.tmap_mode('plot') #switch back static mapsIt is advised to always switch back to plot mode to save connection consumption and limit the number of interactive maps to 10 in one documents when publishing.
4.2 Geospatial Data Wrangling
In this section, the data (sf objects) will be converted to spatstat data structure: ppp (for point data) and owin for observation window.
Here we use as.ppp() of spatstat package to covert the point data childcare_sf to ppp file, confirm the change using class() and have a quick overview of the data statistics via summary().
childcare_ppp <- as.ppp(childcare_sf)
class(childcare_ppp)[1] "ppp"
summary(childcare_ppp)Marked planar point pattern: 1925 points
Average intensity 2.417323e-06 points per square unit
Coordinates are given to 11 decimal places
Mark variables: Name, Description
Summary:
Name Description
Length:1925 Length:1925
Class :character Class :character
Mode :character Mode :character
Window: rectangle = [11810.03, 45404.24] x [25596.33, 49300.88] units
(33590 x 23700 units)
Window area = 796335000 square units
plot(unmark(childcare_ppp), main = "childcare_ppp") #drops the marks, since simple plot() is not displaying properly showing broken tags <th>, <td>, <table> etc
Before moving forwards, let’s check if there are any duplicated points.
any(duplicated(childcare_ppp))[1] FALSE
Similarly, the owin object can be created using the function as.owin() for polygon data. After the conversion, the class() and plot() functions can be used to verify that the object is of the correct class and that the data retains its original shape.
sg_owin <- as.owin(mpsz_cl)
class(sg_owin)[1] "owin"
plot(sg_owin)
The code chunk below combines ppp and owin into one ppp file which means it updates the window of childcare_ppp to sg_owin and keeps the points that fall inside.
childcareSG_PPP = childcare_ppp[sg_owin]
class(childcareSG_PPP)[1] "ppp"
childcareSG_PPPMarked planar point pattern: 1925 points
Mark variables: Name, Description
window: polygonal boundary
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
summary(childcareSG_PPP)Marked planar point pattern: 1925 points
Average intensity 2.875208e-06 points per square unit
Coordinates are given to 11 decimal places
Mark variables: Name, Description
Summary:
Name Description
Length:1925 Length:1925
Class :character Class :character
Mode :character Mode :character
Window: polygonal boundary
41 separate polygons (26 holes)
vertices area relative.area
polygon 1 285 1.61128e+06 2.41e-03
polygon 2 27 1.50315e+04 2.25e-05
polygon 3 (hole) 41 -4.01660e+04 -6.00e-05
polygon 4 (hole) 317 -5.11280e+04 -7.64e-05
polygon 5 (hole) 3 -4.14100e-04 -6.19e-13
polygon 6 30 2.80002e+04 4.18e-05
polygon 7 (hole) 4 -2.86396e-01 -4.28e-10
polygon 8 (hole) 3 -1.81439e-04 -2.71e-13
polygon 9 (hole) 3 -5.99531e-04 -8.95e-13
polygon 10 (hole) 3 -3.04560e-04 -4.55e-13
polygon 11 (hole) 3 -4.46108e-04 -6.66e-13
polygon 12 (hole) 5 -2.44408e-04 -3.65e-13
polygon 13 (hole) 5 -3.64686e-02 -5.45e-11
polygon 14 71 8.18750e+03 1.22e-05
polygon 15 (hole) 38 -7.79904e+03 -1.16e-05
polygon 16 91 1.49663e+04 2.24e-05
polygon 17 (hole) 395 -7.38124e+03 -1.10e-05
polygon 18 40 1.38607e+04 2.07e-05
polygon 19 (hole) 11 -8.36705e+01 -1.25e-07
polygon 20 (hole) 3 -2.33435e-03 -3.49e-12
polygon 21 45 2.51218e+03 3.75e-06
polygon 22 139 3.22293e+03 4.81e-06
polygon 23 148 3.10395e+03 4.64e-06
polygon 24 (hole) 4 -1.72650e-04 -2.58e-13
polygon 25 75 1.73526e+04 2.59e-05
polygon 26 83 5.28920e+03 7.90e-06
polygon 27 106 3.04104e+03 4.54e-06
polygon 28 71 5.63061e+03 8.41e-06
polygon 29 10 1.99717e+02 2.98e-07
polygon 30 (hole) 3 -1.37223e-02 -2.05e-11
polygon 31 (hole) 3 -8.68789e-04 -1.30e-12
polygon 32 (hole) 3 -3.39815e-04 -5.08e-13
polygon 33 (hole) 3 -4.52041e-05 -6.75e-14
polygon 34 (hole) 3 -3.90173e-05 -5.83e-14
polygon 35 (hole) 3 -9.59845e-05 -1.43e-13
polygon 36 (hole) 8 -4.28707e-01 -6.40e-10
polygon 37 (hole) 4 -2.18619e-04 -3.27e-13
polygon 38 (hole) 6 -8.37554e-01 -1.25e-09
polygon 39 (hole) 5 -2.92235e-04 -4.36e-13
polygon 40 14053 6.67892e+08 9.98e-01
polygon 41 (hole) 3 -7.43616e-06 -1.11e-14
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
(53270 x 28810 units)
Window area = 669517000 square units
Fraction of frame area: 0.436
plot(unmark(childcareSG_PPP), main = "childcare_SG_PPP")
4.3 Extracting the Study Area
We focus on the childcare centres in the four areas: Punggol, Tampines, Choa Chu Kang and Jurong West.
The code chunk uses filter() to create a new variable for each area and plot() for quick preview.
pg <- mpsz_cl %>%
filter(PLN_AREA_N == "PUNGGOL")
tm <- mpsz_cl %>%
filter(PLN_AREA_N == "TAMPINES")
ck <- mpsz_cl %>%
filter(PLN_AREA_N == "CHOA CHU KANG")
jw <- mpsz_cl %>%
filter(PLN_AREA_N == "JURONG WEST")par(mfrow=c(2,2))
plot(st_geometry(pg), main = "Ponggol")
plot(st_geometry(tm), main = "Tampines")
plot(st_geometry(ck), main = "Choa Chu Kang")
plot(st_geometry(jw), main = "Jurong West")
pg_owin = as.owin(pg)
tm_owin = as.owin(tm)
ck_owin = as.owin(ck)
jw_owin = as.owin(jw)The code chunk below subsets the dataset to the study areas, rescales the unit of measurement from metre to kilometre and finally plot the areas with childcare points.
childcare_pg_ppp = childcare_ppp[pg_owin] #crop childcare points to Punggol
childcare_tm_ppp = childcare_ppp[tm_owin]
childcare_ck_ppp = childcare_ppp[ck_owin]
childcare_jw_ppp = childcare_ppp[jw_owin]par(mfrow=c(2,2))
plot(unmark(childcare_pg_ppp),
main = "Punggol")
plot(unmark(childcare_tm_ppp),
main = "Tampines")
plot(unmark(childcare_ck_ppp),
main = "Choa Chu Kang")
plot(unmark(childcare_jw_ppp),
main = "Jurong West")
5 Second-order Spatial Point Patterns Analysis
Let us examine the relationships between points at subzone level.
6 Analysing Spatial Point Process Using G-Function
The G-function Gest() shows distance from an event to its nearest event, with Monte Carlo simulation envelop() used to test against complete spatial randomness (CSR).
First we set a fixed random seed to ensure reproducibility.
set.seed(1234)The code chunk below calculates the G-function.
The plot shows childcare centres are clustered within 300 metres but rather random beyond the scale.
G_CK = Gest(childcare_ck_ppp, correction = "border") #correction is border
G_CKFunction value object (class 'fv')
for the function r -> G(r)
......................................................
Math.label Description
r r distance argument r
theo G[pois](r) theoretical Poisson G(r)
rs hat(G)[bord](r) border corrected estimate of G(r)
......................................................
Default plot formula: .~r
where "." stands for 'rs', 'theo'
Recommended range of argument r: [0, 228.98]
Available range of argument r: [0, 550.4]
plot(G_CK,xlim= c(0,500)) #xlim zooms the x-axis, CDF rises above csr before 300 and crosses after 300
The code chunk below test Complete Spatial Randomness.
H0 = The distribution of childcare services at Choa Chu Kang are randomly distributed.
H1 = The distribution of childcare services at Choa Chu Kang are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
Since the number of simulations is 999, for two-sided test, nrank = (nsim + 1) * 0.002 /2
nrank is 1 at default. This is also applicable to the following tests.
G_CK.csr <- envelope(childcare_ck_ppp, Gest, nsim= 999)Generating 999 simulations of CSR ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.
Done.
G_CK.csrPointwise critical envelopes for G(r)
and observed value for 'childcare_ck_ppp'
Edge correction: "km"
Obtained from 999 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/1000 = 0.002
.....................................................................
Math.label Description
r r distance argument r
obs hat(G)[obs](r) observed value of G(r) for data pattern
theo G[theo](r) theoretical value of G(r) for CSR
lo hat(G)[lo](r) lower pointwise envelope of G(r) from simulations
hi hat(G)[hi](r) upper pointwise envelope of G(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 220.38]
Available range of argument r: [0, 550.4]
plot(G_CK.csr, xlim= c(0,300))
Interpretation: The G-function for Choa Chu Kang (KM correction; 999 CSR simulations; two-sided α = 0.002) shows small significant short-range clustering for distances below 100 m. From 100 to 220 m, curve hovers near the upper edge but stays inside, that indicates a tendency toward clustering but not statistically significant.
In Choa Chu Kang, centres are significantly more clustered than random within about 100 m. Between 100 and 220 m, the pattern looks random. Overall the centres are mostly randomly distributed with tendency to clustering.
The code chunk below calculates the G-function.
G_tm = Gest(childcare_tm_ppp, correction = "best") #correction is selected by package
G_tmFunction value object (class 'fv')
for the function r -> G(r)
...................................................................
Math.label Description
r r distance argument r
theo G[pois](r) theoretical Poisson G(r)
km hat(G)[km](r) Kaplan-Meier estimate of G(r)
hazard hat(h)[km](r) Kaplan-Meier estimate of hazard function h(r)
theohaz h[pois](r) theoretical Poisson hazard function h(r)
...................................................................
Default plot formula: .~r
where "." stands for 'km', 'theo'
Recommended range of argument r: [0, 258.51]
Available range of argument r: [0, 807.07]
plot(G_tm, xlim = c(0,300)) #xlim changes the scale
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
Since the number of simulations is 999, for two-sided test, nrank = (nsim + 1) * 0.002 /2
nrank is 1 at default. This is also applicable to the following tests.
G_tm.csr <- envelope(childcare_tm_ppp, Gest, Correction = "all", nsim = 999)Generating 999 simulations of CSR ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.
Done.
G_tm.csrPointwise critical envelopes for G(r)
and observed value for 'childcare_tm_ppp'
Edge correction: "km"
Obtained from 999 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/1000 = 0.002
.....................................................................
Math.label Description
r r distance argument r
obs hat(G)[obs](r) observed value of G(r) for data pattern
theo G[theo](r) theoretical value of G(r) for CSR
lo hat(G)[lo](r) lower pointwise envelope of G(r) from simulations
hi hat(G)[hi](r) upper pointwise envelope of G(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 258.51]
Available range of argument r: [0, 807.07]
plot(G_tm.csr, xlim = c(0,300))
Interpretation: The plot of Monte Carlo simulations (999 CSR sims, KM correction, two sided, 99.8% confidence) show the observed G in Tampines is mostly above the significance band, which implies centres concentrate within recommended range 258m, especially packed within first 60m.
7 Analysing Spatial Point Process Using F-Function
The F-function Fest() estimates the distribution of empty-space distances, which means the distance from a random location to the nearest event. It tells about the gaps. Additionally Monte Carlo envelopes are used to assess complete spatial randomness.
F_CK = Fest(childcare_ck_ppp, correction = "best")
F_CKFunction value object (class 'fv')
for the function r -> F(r)
...................................................................
Math.label Description
r r distance argument r
theo F[pois](r) theoretical Poisson F(r)
km hat(F)[km](r) Kaplan-Meier estimate of F(r)
hazard hat(h)[km](r) Kaplan-Meier estimate of hazard function h(r)
theohaz h[pois](r) theoretical Poisson hazard h(r)
...................................................................
Default plot formula: .~r
where "." stands for 'km', 'theo'
Recommended range of argument r: [0, 304.33]
Available range of argument r: [0, 552.75]
plot(F_CK, xlim= c(0,500))
F_CK.csr <- envelope(childcare_ck_ppp, Fest, correction = "all", nsim = 999)Generating 999 simulations of CSR ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.
Done.
F_CK.csrPointwise critical envelopes for F(r)
and observed value for 'childcare_ck_ppp'
Edge correction: "km"
Obtained from 999 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/1000 = 0.002
.....................................................................
Math.label Description
r r distance argument r
obs hat(F)[obs](r) observed value of F(r) for data pattern
theo F[theo](r) theoretical value of F(r) for CSR
lo hat(F)[lo](r) lower pointwise envelope of F(r) from simulations
hi hat(F)[hi](r) upper pointwise envelope of F(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 304.33]
Available range of argument r: [0, 552.75]
plot(F_CK.csr, xlim=c(0,500))
Interpretation: Choa Chu Kang: F-function (KM correction; 999 CSR simulations; pointwise 99.8% envelope, α = 0.002):
The observed curve lies below the CSR line but remains inside the envelope across the recommended range (0,304m), which indicates no statistically significant departure from CSR. Overall, the centres are broadly consistent with a random (CSR) pattern, with only a mild tendency toward larger empty-space distances.
The code chunk below calculates the F-function.
F_tm = Fest(childcare_tm_ppp, correction = "best")
F_tmFunction value object (class 'fv')
for the function r -> F(r)
...................................................................
Math.label Description
r r distance argument r
theo F[pois](r) theoretical Poisson F(r)
km hat(F)[km](r) Kaplan-Meier estimate of F(r)
hazard hat(h)[km](r) Kaplan-Meier estimate of hazard function h(r)
theohaz h[pois](r) theoretical Poisson hazard h(r)
...................................................................
Default plot formula: .~r
where "." stands for 'km', 'theo'
Recommended range of argument r: [0, 819.49]
Available range of argument r: [0, 819.49]
plot(F_tm)
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
F_tm.csr = envelope(childcare_tm_ppp, Fest, correction = "all", nsim = 999)Generating 999 simulations of CSR ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.
Done.
F_tm.csrPointwise critical envelopes for F(r)
and observed value for 'childcare_tm_ppp'
Edge correction: "km"
Obtained from 999 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/1000 = 0.002
.....................................................................
Math.label Description
r r distance argument r
obs hat(F)[obs](r) observed value of F(r) for data pattern
theo F[theo](r) theoretical value of F(r) for CSR
lo hat(F)[lo](r) lower pointwise envelope of F(r) from simulations
hi hat(F)[hi](r) upper pointwise envelope of F(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 819.49]
Available range of argument r: [0, 819.49]
plot(F_tm.csr)
Interpretation: Tampines: The F-function (KM; 999 CSR sims; pointwise α = 0.002):
From 250m onward, the observed F curve drops below the lower envelope (pointwise a = 0.002) which means it is highly likely there are large caps from any centre over 250m. Together with the G-function evidence of significant short-range clustering, the results suggest a cluster–hole pattern: centres concentrate around a few hotspots, leaving larger under-covered areas farther from any centre.
8 Analysing Spatial Point Process Using K-Function
K-function measures the number of points found up to a given distance of any particular point. Similarly here we first compute the K-function estimates using Kest() then test the K-function against CSR.
The code chunk below calculates the K-function.
K_CK = Kest(childcare_ck_ppp, correction = "Ripley")
K_CKFunction value object (class 'fv')
for the function r -> K(r)
................................................................
Math.label Description
r r distance argument r
theo K[pois](r) theoretical Poisson K(r)
iso hat(K)[iso](r) Ripley isotropic correction estimate of K(r)
................................................................
Default plot formula: .~r
where "." stands for 'iso', 'theo'
Recommended range of argument r: [0, 794.97]
Available range of argument r: [0, 794.97]
plot(K_CK, .-r ~ r, ylab = "K(d) - r", xlab = "d(m)") #plot (each y-series minus r) against r.
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
K_CK.csr <- envelope(childcare_ck_ppp, Kest, nsim = 99)Generating 99 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99.
Done.
K_CK.csrPointwise critical envelopes for K(r)
and observed value for 'childcare_ck_ppp'
Edge correction: "iso"
Obtained from 99 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/100 = 0.02
.....................................................................
Math.label Description
r r distance argument r
obs hat(K)[obs](r) observed value of K(r) for data pattern
theo K[theo](r) theoretical value of K(r) for CSR
lo hat(K)[lo](r) lower pointwise envelope of K(r) from simulations
hi hat(K)[hi](r) upper pointwise envelope of K(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 794.97]
Available range of argument r: [0, 794.97]
plot(K_CK.csr, . -r ~ r, xlab = "d", ylab = "K(d) - r") #plot (each y-series minus r) against r.
Interpretation: Choa Chu Kang: K-function,isotropic edge correction; 99 CSR simulations; pointwise α = 0.02 The black curve remains close within the envelope over 0 to 500 m, so we do not detect a significant departure from CSR; from 500m to 800m, we find the slight positive deviation. Overall it suggests only a weak, non-significant clustering tendency.
K_tm = Kest(childcare_tm_ppp, correction = "Ripley")
plot(K_tm, . -r ~ r,
ylab= "K(d)-r", xlab = "d(m)", #plot (each y-series minus r) against r.
xlim=c(0,1000))
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
K_tm.csr <- envelope(childcare_tm_ppp, Kest, nsim = 99, rank = 1)Generating 99 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99.
Done.
K_tm.csrPointwise critical envelopes for K(r)
and observed value for 'childcare_tm_ppp'
Edge correction: "iso"
Obtained from 99 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/100 = 0.02
.....................................................................
Math.label Description
r r distance argument r
obs hat(K)[obs](r) observed value of K(r) for data pattern
theo K[theo](r) theoretical value of K(r) for CSR
lo hat(K)[lo](r) lower pointwise envelope of K(r) from simulations
hi hat(K)[hi](r) upper pointwise envelope of K(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 1665]
Available range of argument r: [0, 1665]
plot(K_tm.csr, .- r~ r, xlab = "d", ylab = "K(d) - r", xlim = c(0, 1700)) #plot (each y-series minus r) against r.
Interpretation: Tampines: K-function,isotropic edge correction; 99 CSR simulations; pointwise α = 0.02
The black curve rises above the envelope over 0 to 1665 m, which shows a significant departure from CSR. K suggests clustering across many scales,
9 Analysing Spatial Point Process Using L-Function
Here we use K function- normalised version L-function Lest() and test against CSR envelope().
The code chunk below calculates the L-function.
L_CK = Lest(childcare_ck_ppp, correction = "Ripley")
L_CKFunction value object (class 'fv')
for the function r -> L(r)
................................................................
Math.label Description
r r distance argument r
theo L[pois](r) theoretical Poisson L(r)
iso hat(L)[iso](r) Ripley isotropic correction estimate of L(r)
................................................................
Default plot formula: .~r
where "." stands for 'iso', 'theo'
Recommended range of argument r: [0, 794.97]
Available range of argument r: [0, 794.97]
plot(L_CK, .- r~ r, ylab = "L(d) - r", xlab = "d(m)")
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
L_CK.csr <- envelope(childcare_ck_ppp, Lest, nsim = 99)Generating 99 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99.
Done.
L_CK.csrPointwise critical envelopes for L(r)
and observed value for 'childcare_ck_ppp'
Edge correction: "iso"
Obtained from 99 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/100 = 0.02
.....................................................................
Math.label Description
r r distance argument r
obs hat(L)[obs](r) observed value of L(r) for data pattern
theo L[theo](r) theoretical value of L(r) for CSR
lo hat(L)[lo](r) lower pointwise envelope of L(r) from simulations
hi hat(L)[hi](r) upper pointwise envelope of L(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 794.97]
Available range of argument r: [0, 794.97]
plot(L_CK.csr, . - r ~ r, xlab="d", ylab="L(d)-r")
Interpretation:
Choa Chu Kang: L-function,isotropic edge correction; 99 CSR simulations; pointwise α = 0.02
The black curve is mostly inside the envelope across r within 795m, except one upside shape deviation around 60m. F function esitmates is mostly positive and within the pointwise envelope, so Choa Chu Kang shows no significant departure from CSR (only a slight clustering tendency).
The code chunk below calculates the L-function.
L_tm = Lest(childcare_tm_ppp, correction = "Ripley")
L_tmFunction value object (class 'fv')
for the function r -> L(r)
................................................................
Math.label Description
r r distance argument r
theo L[pois](r) theoretical Poisson L(r)
iso hat(L)[iso](r) Ripley isotropic correction estimate of L(r)
................................................................
Default plot formula: .~r
where "." stands for 'iso', 'theo'
Recommended range of argument r: [0, 1665]
Available range of argument r: [0, 1665]
plot(L_tm, .- r~ r, ylab = "L(d) - r", xlab = "d(m)", xlim = c(0,1000))
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
L_tm.csr <- envelope(childcare_tm_ppp,Lest, nsim = 99, nrank = 1)Generating 99 simulations of CSR ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99.
Done.
L_tm.csrPointwise critical envelopes for L(r)
and observed value for 'childcare_tm_ppp'
Edge correction: "iso"
Obtained from 99 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/100 = 0.02
.....................................................................
Math.label Description
r r distance argument r
obs hat(L)[obs](r) observed value of L(r) for data pattern
theo L[theo](r) theoretical value of L(r) for CSR
lo hat(L)[lo](r) lower pointwise envelope of L(r) from simulations
hi hat(L)[hi](r) upper pointwise envelope of L(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 1665]
Available range of argument r: [0, 1665]
plot(L_tm.csr, .- r~ r, ylab = "L(d) - r", xlab = "d", xlim = c(0,1700))
Interpretation:
Tampines: L-function,isotropic edge correction; 99 CSR simulations; pointwise α = 0.02
The black line lies above the simulation envelope, indicating strong clustering at scales; in view of F showing large gaps, this likely reflects hotspot-driven inhomogeneity.
10 Analysing Spatial Point Process Using J-Function
There is also another test mentioned in the lecture videos called J-function Jest(), which is combination of G and F. J(r) = [1 - G(r)]/ [1 - F(r)]. J(r) > 1 implies regluar, = 1 CSR, < 1 cluster. When the observed line is below envelope suggests clustering, when it is over envelope suggests regular pattern.
Here we try out Jest() and test against CSR envelope().
The code chunk below calculates the J-function.
J_CK = Jest(childcare_ck_ppp, correction = "best")
J_CKFunction value object (class 'fv')
for the function r -> J(r)
.....................................................................
Math.label Description
r r distance argument r
theo J[pois](r) theoretical Poisson J(r)
km hat(J)[km](r) Kaplan-Meier estimate of J(r)
hazard hazard(r) Kaplan-Meier estimate of derivative of log(J(r))
.....................................................................
Default plot formula: .~r
where "." stands for 'km', 'theo'
Recommended range of argument r: [0, 305.3]
Available range of argument r: [0, 550.4]
plot(J_CK)
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
J_CK.csr <- envelope(childcare_ck_ppp, Jest, nsim = 999)Generating 999 simulations of CSR ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.
Done.
J_CK.csrPointwise critical envelopes for J(r)
and observed value for 'childcare_ck_ppp'
Edge correction: "km"
Obtained from 999 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/1000 = 0.002
.....................................................................
Math.label Description
r r distance argument r
obs hat(J)[obs](r) observed value of J(r) for data pattern
theo J[theo](r) theoretical value of J(r) for CSR
lo hat(J)[lo](r) lower pointwise envelope of J(r) from simulations
hi hat(J)[hi](r) upper pointwise envelope of J(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 305.3]
Available range of argument r: [0, 550.4]
plot(J_CK.csr, xlim = c(0,300))
Interpretation:
Choa Chu Kang: J-function, KM edge correction; 999 CSR simulations; pointwise α = 0.002
The black curve is mostly inside the envelope but it is hard to interpret.
The code chunk below calculates the J-function.
J_tm = Jest(childcare_tm_ppp, correction = "best")
J_tmFunction value object (class 'fv')
for the function r -> J(r)
.....................................................................
Math.label Description
r r distance argument r
theo J[pois](r) theoretical Poisson J(r)
km hat(J)[km](r) Kaplan-Meier estimate of J(r)
hazard hazard(r) Kaplan-Meier estimate of derivative of log(J(r))
.....................................................................
Default plot formula: .~r
where "." stands for 'km', 'theo'
Recommended range of argument r: [0, 807.07]
Available range of argument r: [0, 807.07]
plot(J_tm)
The code chunk below test Complete Spatial Randomness.
Ho = The distribution of childcare services at Tampines are randomly distributed.
H1 = The distribution of childcare services at Tampines are not randomly distributed.
The null hypothesis will be rejected is p-value is smaller than alpha value of 0.001.
J_tm.csr <- envelope(childcare_tm_ppp, Jest, nsim = 999, nrank = 1)Generating 999 simulations of CSR ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.
Done.
J_tm.csrPointwise critical envelopes for J(r)
and observed value for 'childcare_tm_ppp'
Edge correction: "km"
Obtained from 999 simulations of CSR
Alternative: two.sided
Significance level of pointwise Monte Carlo test: 2/1000 = 0.002
.....................................................................
Math.label Description
r r distance argument r
obs hat(J)[obs](r) observed value of J(r) for data pattern
theo J[theo](r) theoretical value of J(r) for CSR
lo hat(J)[lo](r) lower pointwise envelope of J(r) from simulations
hi hat(J)[hi](r) upper pointwise envelope of J(r) from simulations
.....................................................................
Default plot formula: .~r
where "." stands for 'obs', 'theo', 'hi', 'lo'
Columns 'lo' and 'hi' will be plotted as shading (by default)
Recommended range of argument r: [0, 807.07]
Available range of argument r: [0, 807.07]
plot(J_tm.csr)
Interpretation:
Tampines: J-function, KM edge correction; 999 CSR simulations; pointwise α = 0.002
It is hard to interpret.
11 Summary
Through multiple tests (GFKL functions) against complete spatial randomness, we can conclude the childcare centres in Choa Chu Kang are highly like randomly distributed whereas the centres in Tampines are largely clustered.
12 Reference
Kam, T. S. 2nd Order Spatial Point Patterns Analysis Methods. R for Geospatial Data Science and Analytics. https://r4gdsa.netlify.app/chap05